Conversation
caf4dea to
9df4a48
Compare
Pull Request Test Coverage Report for Build 23353160193Details
💛 - Coveralls |
f641887 to
99ddcd5
Compare
73275d0 to
d9e0b3e
Compare
|
@cursor review |
d9e0b3e to
0690684
Compare
0690684 to
a870ef1
Compare
45db360 to
2118254
Compare
6dcb742 to
fc386c2
Compare
fc386c2 to
d9b6fa5
Compare
6b54329 to
a2c5124
Compare
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
a2c5124 to
57366f9
Compare
57366f9 to
dbd3c52
Compare
|
I wonder if you could gate this behind a feature flag so we could choose if we want it. Or even better, a generic so we could support both with and without the performance penalty. I can see this helping users with low memory who now may be able to stay in sync whereas before they couldn't because they couldn't get enough parallelization. But we're making people who may have tons of RAM also pay the performance penalty. So it a trade-off that only some might be willing to make. |
It's gated behind
The performance penalty comes from recording the Allocator checkpoints. We don't do that when the flag is not set. |
…ply invocation results in a simple atom being returned. Everything allocated by the invocation can be freed
6859b78 to
b039234
Compare
This PR implements a simple, call stack-based, garbage collection.
overview
The observation is that any operator that returns an atom that was allocated before the snapshot was taken (i.e. picking an atom out of the environment) or a simple atom. e.g.
NILor1effectively "launder" all computation (and allocations) that went into computing its arguments into a very small amount of data.When this happens we can reset the
Allocatorto the state it was in before we invoked the operator, as long as we preserve the return value.This requires us to pre-emptively store
Allocatorcheckpoints just in case the operator returns a simple atom. This adds overhead and we'll have to make a judgement call on whether we think it's worth it.The cost of recording Allocator snapshots has been mitigated by:
Checkpoint, calledTransparentCheckpointthat's smaller. It usesu32instead ofusizeand it does not record the "ghost" counters.src/chia_dialect.rsingc_candidate().restoring the
AllocatorWhen restoring the allocator, it's important that we preserve the same behavior as we have today, with regards to counters. The atom- and pair counters are consensus critical, since we have upper limits on them. So when we restore the allocator, we have to do it transparently, and preserve counters.
The way this is done is to increment the ghost-counters by the same amount as we are freeing. This acts as if the atoms and pairs were never freed, just like it works today.
This is one important distinction between the existing
checkpoint()->restore_checkpoint()functions and the newtransparent_checkpoint()->restore_transparent_checkpoint()functions. But note that the transparent version is a subset of the existing behavior.benchmarking
To benchmark this I picked some of the most expensive block generators from mainnet. Expensive in the number of atoms, pairs and heap-bytes allocated, but also in execution run-time.
Extending the
analyze-chain.rstool, I picked the generators at the following heights:running these generators before and after this change gave me the following results:
comparison
run-time
peak atom count
peak pair count
peak heap size
before (main)
after (with this change)
Note
High Risk
High risk because it changes core
Allocatorcheckpoint/restore semantics and therun_programexecution loop, which can affect consensus-critical accounting (atom/pair/heap counters) and runtime behavior under specific opcodes.Overview
Implements an optional, call-stack based garbage-collection optimization in
run_program: for selected opcodes, the interpreter now takes a lightweight allocator snapshot before evaluating arguments and may restore it after the op returns, preserving the return value while reclaiming temporary allocations.This adds
TransparentCheckpointplusmaybe_restore_with_node()inAllocatorto support restoring without changing consensus-critical allocation counters (using ghost accounting), and introducesDialect::gc_candidate()with a newClvmFlags::ENABLE_GCgate (implemented forChiaDialect, disabled forRuntimeDialect).Updates benches/tools/tests/fuzzing to exercise GC mode by default in several harnesses and adds a dedicated
garbage-collectionfuzz target that asserts identical cost/results and allocator accounting with and withoutENABLE_GC.Written by Cursor Bugbot for commit b039234. This will update automatically on new commits. Configure here.